ABSTRACT


A compiler generation system is described which is rigorously
based and which allows formal specification both of the source
(procedure-oriented) languages and of the object (machine-oriented)
languages. An intermediate or "buffer" language, BASE, is interposed,
reducing the required transformation to two logically independent
subtransformations. The system, so far, includes those elements in BASE
necessary to produce ALGOL, FORTRAN, and JOVIAL compilers.
1.  INTRODUCTION


This paper reports on a recently developed compiler generation system
which is rigorously based, and which allows formal specification both of
source (procedure-oriented) languages (POLs) and of machine languages
(MLs). Concepts underlying the system are discussed, an example
correlating source language specification with system operation is given, and
the status and potentialities of the system are discussed.

The crucial problem of compiler generation is the characterization of
procedure-oriented languages; the process is of limited use unless such
characterization allows machine-independent processing of programs in
these languages (and hence allows invariance of the language itself from
machine to machine). Our solution interposes between POL and ML a
"buffer" or "intermediate" language, called BASE, thus reducing the
required POL → ML transformation to two logically independent
subtransformations:

(1) POL → BASE (called compilation)

(2) BASE → ML (called translation).

This arrangement isolates questions of POL characterization within the
first transformation, and questions of ML characterization within the second
transformation. BASE itself is an expandable set of non-machine-specific
operators, declarators, etc., expressed in a uniform "functional" or
"macro" notation (see footnote 1); the meaning or intent of such operators is
arbitrary insofar as the compilation transformation is concerned. The
POL → BASE transformation may then be regarded as a machine-independent
conversion, from a grammatically rich format to a simple linear format.

2.  THEORETICAL  BASIS 

Within our system, a POL is characterized principally by a grammar
(i.e., a set of syntactic productions), and the consequent processing of programs
in the POL is syntax-driven. To assure adequacy with respect to completeness,
ambiguity, and finiteness of analysis, our syntactic method is rigorously
based. A grammatical model (the analytic grammar) was developed [8]
which provides a rigorous description of syntactic analysis via formalization
of the notion of a scan. Within this model, the selection process of a scanning
procedure can be precisely stated, and thus made amenable to theoretical
investigation. Some characteristics of this model are:

• all analytic languages are recursive

• all recursive sets are analytic languages

• all phrase structure grammars are analytic grammars

• there is a simple sufficient condition under which an analytic
grammar provides unique analyses for all strings.


1. Each BASE operation, declarator, etc., consists of a three-letter
operation code followed by n ≥ 1 operand type specifier/operand pairs
S_i/X_i; e.g., FFF(S_1/X_1, ..., S_n/X_n).
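The uniform notation described in footnote 1 (a three-letter operation code followed by one or more specifier/operand pairs) can be illustrated with a small parser. This is only a sketch: the function name `parse_base_op` is hypothetical, and the comma separator between pairs and the absence of commas inside operands are assumptions made for the example.

```python
import re

def parse_base_op(text):
    """Split 'OPC(S1/X1,S2/X2,...)' into (opcode, [(specifier, operand), ...]).

    Assumed syntax: a three-letter opcode, then parenthesized pairs separated
    by commas, each pair written as specifier/operand.
    """
    match = re.fullmatch(r"\s*([A-Z]{3})\s*\((.*)\)\s*", text)
    if match is None:
        raise ValueError(f"not a BASE operation: {text!r}")
    opcode, body = match.groups()
    pairs = []
    for item in body.split(","):
        spec, sep, operand = item.strip().partition("/")
        if not sep or not spec or not operand:
            # Also rejects an empty operand list, since n >= 1 pairs are required.
            raise ValueError(f"bad specifier/operand pair: {item!r}")
        pairs.append((spec, operand))
    return opcode, pairs

print(parse_base_op("PWR(C/F6)"))
print(parse_base_op("AAA(C/R(-5))"))
```

Because the compiler merely transmits these operations, a parser of this kind would belong to the translator side, where the operation codes acquire meaning.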


The grammar in a POL specification permits certain abbreviations
and orderings of productions (for convenience, brevity, and efficiency), but
is nevertheless equivalent to a grammar using the simple scan of [8].
(An equivalent grammar using the simple scan is obtainable via a simple construction.)


Context-sensitive productions may be used. Our method guarantees
uniqueness of analysis - it is impossible to embed syntactic ambiguity in a
language specification. A simple test ensures finite analyses for all strings.
Such a grammar is at least as inclusive as the context-sensitive phrase
structure grammar, and there does not appear to be any grammatical
structure which cannot be accommodated (grammars of ALGOL, JOVIAL,
and FORTRAN were obtained without difficulty).


In fact, such grammars are sufficiently powerful to accommodate the
notions of "definition" and "counting" (cf. [7] and the examples of [8]),
but to actually do so is neither efficient nor expedient. Therefore, a POL
characterization includes description of pertinent "internal operations"
(see the example in this paper).

3.  SYSTEM  OVERVIEW

An overview of the generation system is shown in Figure 1.

Figure 1. Overview of System

Using this system, the transformation from a source language L to a machine
language M is achieved as follows:

A  specification  of  L  -  an  abstract  description  of  the  syntactic  structure, 
"internal  processing  rules,"  and  "output  code"  for  L  -  is  written.  This 
specification  is  processed  by  the  compiler  generation  system  to  produce  a 
tape  of  L  -  a  set  of  data  tables  corresponding  to  the  specification.  The 
compiler  for  L  is  then  formed  by  conjunction  of  the  tape  of  L  with  a  compiler 
model  program,  a  table-directed  processor  which  acts  simply  as  a  machine 
for  interpreting  the  tape  of  L. 
Similarly, a specification of M is written designating macro-expansions
appropriate to M. This specification is processed by a translator generation
system to produce a tape of M - data tables containing the specified
macro-expansions. The translator for M is then formed by conjunction of this tape
with a translator model program, which expands BASE operations to sequences
of instructions in M as directed by the tape of M.

4.  COMPILATION  SYSTEM  DATA  BASE 

Processing of input strings (POL programs) by a generated compiler
is intended to occur in two parts:

(a) preliminary conversion of "raw" input symbols to yield a "syntactic"
or "construct" string, which represents the raw input for all further processing,
and then

(b) step-by-step syntactic analysis, and (at each analysis step)
performance of prescribed sets of internal operations, prescribed output of
"code blocks," output of diagnostic messages, and (if desired) performance
of additional auxiliary processes.

The internal operations in a POL specification assume a set of data
entities (the "data base"), which are later manipulated as prescribed by a
generated compiler. Each entry of the construct string (which represents
the raw input during processing) contains a construct (or syntactic type or
token) and an associated datum, which is originally derived from the raw
input, but may be internally altered. The use of appropriate string handling
routines allows effectively a construct string of unbounded length. Other
data entities are:
(a) a set of function registers Fi, for storage and manipulation
of "temporary" numeric data

(b) a set of symbol registers Si, for manipulation of symbol strings

(c) a property table of integer properties Pi(J), for storage and
manipulation of numeric data (e.g., number of dimensions)
associated with "variables" in the input string. "Names" (i.e.,
contents of symbol registers) can be "defined" to the table to
reserve table entries for associated data, and the table can be
"searched." Defined names are placed in a property table index.
The Jth table entry consists of four properties P0(J), P1(J),
P2(J), P3(J). By convention, P0(J) is the syntactic class of the
corresponding defined name.

See  Figure  2  for  further  details. 
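The data entities listed above can be pictured with a minimal sketch. The class and method names (`DataBase`, `PropertyTable`, `define`, `search`) are hypothetical, as are the choices to number table entries from 0, to initialize properties to 0, and to return the existing entry when a name is defined twice; the paper does not state these details.

```python
from collections import defaultdict

class PropertyTable:
    """Property table with an index of defined names (assumed layout)."""

    def __init__(self):
        self.index = {}    # defined name -> entry number J
        self.entries = []  # entry J -> [P0, P1, P2, P3]

    def define(self, name, syntactic_class):
        """Reserve an entry for name; P0 holds its syntactic class."""
        if name in self.index:  # assumption: redefinition is a no-op
            return self.index[name]
        self.entries.append([syntactic_class, 0, 0, 0])
        self.index[name] = len(self.entries) - 1
        return self.index[name]

    def search(self, name):
        """Return the entry number of a defined name, or None."""
        return self.index.get(name)

class DataBase:
    """Function registers, symbol registers, and the property table."""

    def __init__(self):
        self.F = defaultdict(int)  # function registers: numeric temporaries
        self.S = defaultdict(str)  # symbol registers: symbol strings
        self.properties = PropertyTable()

db = DataBase()
db.S[1] = "ALPHA"
j = db.properties.define(db.S[1], syntactic_class=7)
db.properties.entries[j][1] = 2  # e.g., P1(J) = number of dimensions
print(j, db.properties.search("ALPHA"), db.properties.entries[j])
```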

[Figure 2: block diagram. Recoverable labels: INPUT DATA; INPUT AND PRELIM CONVERSION; CONSTRUCT STRING; SYNTAX PROC: REWRITING OF CONSTRUCT STRING; GET MORE STRING.]
5.  POL  SPECIFICATION  AND  COMPILATION  SYSTEM  OPERATION

The relation between a POL specification and the consequent
compilation system processing is best shown via an example. Figure 3 shows
a specification (see footnote 2) of the language LEMMA2 (first exhibited in
Lemma 2 of [2]), which consists of sentences having the form

    'A^n B^m A^n B^m CCC'

where X^k signifies a sequence of k X's. Some sentences of LEMMA2 are

    'AABBBAABBBCCC'

    'AAAABBAAAABBCCC'

    'AAABBBBAAABBBBCCC'


    * TITLE(LEMMA2)
    * SYMBOLS
    (1)A       (A)      (1)
    (1)B       (B)      (2)
    (1)C       (C)      (3)
    (1)*       (END)    (0)
    (1)        (NULL)   (0)
    ((EOC))    (NULL)   (0)
    * END SYMBOLS
    * SYNTAX
    001  (A)(B)(K)                      →  (B)
    002  (B)(A)(A)                      →  (B)(K)(A)
    003  (B)(A)(B)                      →  (Q)
    004  (B)(B)(K)                      →  (B)(X)(K)
    005  (B)(Q)(B)                      →  (Q)
    006  (END)(A)(Q)(C)(C)(C)(END)      →  (Z)
    007  (X)(B)                         →  (K)(B)
    008  (X)(K)                         →  (X)(B)
    * END SYNTAX
    * INTERNAL FUNCTIONS
    001  RTV  F3  -2
         INC  F3  1
         ASO  -3  F3
    003  SET  F5  1
         SET  F6  2
         PUT  S1  V0(-1)
         SUF  S1  V0(0)
         DEF  S1  ((A))
         ASO  0  F0
         SET  P1(F0)  1
    005  INC  F5  1
         MPY  F6  2
    006  PRN  1  S1
    * END INTERNAL FUNCTIONS
    * CODE
    003  BEG(V/R(0))
         PWR(C/F6)
    005  PWR(C/F6)
    006  AAA(C/R(-5))
         BBB(C/F5)
    * END CODE
    * DIAGNOSTICS
    0001  ******* END OF SAMPLE ANALYSIS *****
    * END DIAGNOSTICS
    * END DATA

Figure 3. Specification of the LEMMA2 Language


2. The specification is shown in "reference" format, which differs
trivially from the format used in machine processing of specifications.


The  specification  contains  five  sections: 

(1)  Symbols  -  specifies  the  preliminary  conversion  of  input  symbols 
and  "reserved  words"  to  construct  string  entries 

(2)  Syntax  -  a  set  of  syntactic  productions  for  use  in  syntactic  analysis 

(3)  Internal  Functions  -  the  internal  processing  to  be  carried  out  at 
each  analysis  step 

(4)  Code  -  the  sequences  of  codes  to  be  output  at  each  analysis  step 

(5)  Diagnostic  Messages  -  a  set  of  messages  for  output 

The sections containing internal functions, code and diagnostic messages
are unnecessary in defining the language structure, but have been added to
illustrate these mechanisms. The codes BEG, PWR, AAA and BBB appearing
in  the  code  section  were  invented  expressly  for  this  example;  arbitrary 
BASE  operation  codes  may  be  designated  at  will,  since  these  codes  are 
merely  transmitted  during  compilation.  The  following  discussion  can  be 
correlated  with  Figure  4,  which  shows  the  compilation  analysis  trace  for 
a  LEMMA2  program,  together  with  resulting  values  of  function  registers 
and  code  output  at  each  analysis  step. 
The conversion specified in the Symbols section, of raw input symbols
to  construct  string  format,  is  performed  specifically  to  eliminate  dependency 
of  processing  on  particular  machine  character  sets  and  hollerith  codes.  A 
construct  string  entry  containing  a  construct  and  an  associated  datum 
replaces  each  input  symbol  (or  symbol  sequence  constituting  a  reserved 
word);  Figure  5  illustrates  this  process.  An  arbitrary  numeric  or  hollerith 
datum  may  be  specified.  Data  from  the  construct  string  may  be  used  to 
construct  symbol  strings  (names),  but  this  usage  is  not  dependent  on  the 
specific  hollerith  codes  which  are  used. 


SYMBOLS SECTION OF SPECIFICATION

    * TITLE(LEMMA2)
    * SYMBOLS
    (1)A       (A)      (1)
    (1)B       (B)      (2)
    (1)C       (C)      (3)
    (1)*       (END)    (0)
    (1)        (NULL)   (0)
    ((EOC))    (NULL)   (0)
    * END SYMBOLS

Figure  5.  Preliminary  Symbol  Conversion 

• The number in parentheses on the left indicates the number of characters
comprising the reserved word. The symbols of the reserved word follow.

• A construct (e.g., (END)) is specified for each symbol or reserved word.
Use of the construct (NULL) specifies that no construct string entry is to
be made; thus "blanks" are ignored above.

• A datum is specified for each symbol or reserved word. Either a numeric
datum (e.g., (3)) or a hollerith datum (e.g., ((h)), where h is the desired
hollerith datum) may be specified.

• The special notation ((EOC)) denotes the "end of card symbol", which in
many languages is regarded as a punctuation mark. A representation of
((EOC)) must be given in every Symbols section.
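The preliminary conversion described in these notes can be sketched as a table lookup over the raw input. The table below follows the Symbols section of the LEMMA2 specification as far as it can be read; representing the end-of-card symbol by a newline, trying longer reserved words first, and the names `SYMBOLS` and `convert` are assumptions made for illustration.

```python
EOC = "\n"  # assumption: the end-of-card symbol is modeled as a newline

# (symbol or reserved word, construct, datum)
SYMBOLS = [
    ("A", "A", 1),
    ("B", "B", 2),
    ("C", "C", 3),
    ("*", "END", 0),
    (" ", "NULL", 0),  # blanks produce no construct string entry
    (EOC, "NULL", 0),
]

def convert(raw):
    """Convert raw input symbols to a construct string of (construct, datum)."""
    table = sorted(SYMBOLS, key=lambda entry: len(entry[0]), reverse=True)
    construct_string = []
    position = 0
    while position < len(raw):
        for symbol, construct, datum in table:
            if raw.startswith(symbol, position):
                if construct != "NULL":
                    construct_string.append((construct, datum))
                position += len(symbol)
                break
        else:
            raise ValueError(f"unrecognized symbol at position {position}")
    return construct_string

print(convert("*AAB BC*\n"))
```

After this step the rest of the compiler sees only constructs and data, which is what removes the dependency on a particular machine character set.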


The syntactic productions in a specification's Syntax section are applied
(as determined by the compiler model's scan) to "rewrite" the construct
string, in a step-by-step fashion (see Figure 6). The succession of these
rewritings constitutes the syntactic analysis of the construct string. In
selecting productions from the set of Figure 6, the compiler model uses the
"leftmost" scan of [8]; i.e., at each step the production chosen is the
one whose "left side" occurs first (leftmost) in the construct string. Thus
at the first analysis step, the substring chosen is BAA; at the second, ABK;
and so on. To allow explicit reference to the data which accompany the
constructs of the substring chosen, a scan position is defined (at each step)
to occur at the last (rightmost) construct of the selected substring (see
Figure 6).


Figure 6. Syntactic Analysis
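The leftmost scan can be sketched as a rewriting loop. The productions below are a reading of the Syntax section of the LEMMA2 specification and should be treated as an assumption, since the available copy is hard to read; likewise, comparing left sides by their starting position, delimiting a sentence with END constructs, and the names `analyze` and `recognize` are choices made for this illustration.

```python
# (label, left side, right side), as read from the LEMMA2 specification
PRODUCTIONS = [
    ("001", ["A", "B", "K"], ["B"]),
    ("002", ["B", "A", "A"], ["B", "K", "A"]),
    ("003", ["B", "A", "B"], ["Q"]),
    ("004", ["B", "B", "K"], ["B", "X", "K"]),
    ("005", ["B", "Q", "B"], ["Q"]),
    ("006", ["END", "A", "Q", "C", "C", "C", "END"], ["Z"]),
    ("007", ["X", "B"], ["K", "B"]),
    ("008", ["X", "K"], ["X", "B"]),
]

def find_first(string, pattern):
    """Index of the leftmost occurrence of pattern in string, or None."""
    for start in range(len(string) - len(pattern) + 1):
        if string[start:start + len(pattern)] == pattern:
            return start
    return None

def analyze(constructs, max_steps=10_000):
    """Rewrite until no production applies; return (final string, trace)."""
    string = list(constructs)
    trace = []  # (production label, scan position) for each analysis step
    for _ in range(max_steps):
        best = None
        for label, left, right in PRODUCTIONS:
            start = find_first(string, left)
            if start is not None and (best is None or start < best[0]):
                best = (start, label, left, right)
        if best is None:
            break
        start, label, left, right = best
        # Scan position: rightmost construct of the selected substring.
        trace.append((label, start + len(left) - 1))
        string[start:start + len(left)] = right
    return string, trace

def recognize(sentence):
    """True if the sentence reduces to the single construct Z."""
    final, _ = analyze(["END"] + list(sentence) + ["END"])
    return final == ["Z"]

print(recognize("AABBBAABBBCCC"), recognize("AABAAABCCC"))
```

Under this reading, a string such as AABAAB followed by CCC selects BAA first (production 002) and ABK second (production 001), in agreement with the steps named in the text.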

At each analysis step, internal operations associated with the selected
production are performed; function registers or properties within the
property table may be set, used, or arithmetically manipulated; character
strings may be placed in, prefixed to, or suffixed to symbol registers, and
so on. The Internal Functions section (see Figure 7) consists of sequences
of internal function operations. The first operation of each sequence has
the label of the production for which action is taken. Thus the sequence
RTV F3 -2, etc., is performed each time production 001 is selected.


Figure 7. Performance of Internal Functions

• SET F5 1 places the value 1 in the function register F5.

• PUT S1 V0(-1) places the datum (regarded as hollerith) from construct
string position (-1) - relative to the scan position - into the symbol
register S1. All previous contents of S1 are deleted.

• SUF S1 V0(0) suffixes to the string in S1 the datum from construct string
position 0.

• DEF S1 ((A)) "defines" the string in S1 to the property table: a property
table entry (say the nth) is reserved, and the string in S1 is entered into the
property table index, together with the entry number n. The number
representing the construct (A) is placed in P0(n), and n is placed in F0.

• ASO 0 F0 "associates" the value in F0 with the construct in string
position 0; the value F0 is placed in the datum of position 0.

• SET P1(F0) 1 places the value 1 in P1(F0), i.e., in P1(n).
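The operations explained above can be mimicked by a small interpreter state. This is a sketch under stated assumptions: the class `State`, its method names, the numeric codes in `CONSTRUCT_CODES`, the sample data, and the numbering of table entries from 0 are all hypothetical; only the effect of each operation is taken from the descriptions above.

```python
CONSTRUCT_CODES = {"A": 1, "B": 2, "Q": 3}  # hypothetical construct numbers

class State:
    def __init__(self, string, scan):
        self.string = string  # construct string: list of [construct, datum]
        self.scan = scan      # scan position (rightmost selected construct)
        self.F = {}           # function registers
        self.S = {}           # symbol registers
        self.index = {}       # property table index: name -> entry number
        self.P = []           # property table: entry n -> [P0, P1, P2, P3]

    def datum(self, rel):
        return self.string[self.scan + rel][1]

    def set_f(self, reg, value):            # SET Fi value
        self.F[reg] = value

    def put(self, sreg, rel):               # PUT Si V0(rel)
        self.S[sreg] = str(self.datum(rel))

    def suf(self, sreg, rel):               # SUF Si V0(rel)
        self.S[sreg] += str(self.datum(rel))

    def define(self, sreg, construct):      # DEF Si ((construct))
        n = len(self.P)
        self.P.append([CONSTRUCT_CODES[construct], 0, 0, 0])
        self.index[self.S[sreg]] = n
        self.F["F0"] = n

    def aso(self, rel, freg):               # ASO rel Fi
        self.string[self.scan + rel][1] = self.F[freg]

    def set_p(self, i, freg, value):        # SET Pi(Fj) value
        self.P[self.F[freg]][i] = value

# The sequence attached to production 003, applied to a selected (B)(A)(B).
state = State([["B", "b1"], ["A", "a1"], ["B", "b2"]], scan=2)
state.set_f("F5", 1)
state.set_f("F6", 2)
state.put("S1", -1)
state.suf("S1", 0)
state.define("S1", "A")
state.aso(0, "F0")
state.set_p(1, "F0", 1)
print(state.S["S1"], state.F, state.P, state.string[2])
```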
Care has been taken in formulating the internal operations to achieve
economy of means: simple operations, a minimum of system data entities,
and a minimum of compiler model machinery. Such a formulation allows a
simple compiler model program, while language complexities must be
expressed within the language specification. Some anomalies of notation
still remain from our earlier efforts, but it is planned to revise and clarify
notation.

Operation  sequences  pertaining  to  different  productions  are  independent 
of  each  other,  since  there  is  no  "GOTO"  operation  (a  "skip  forward"  is 
sometimes  permitted).  Thus  a  finite  sequence  of  operations  is  performed 
at  any  analysis  step. 

Code may be output at any analysis step. Operation codes and
operand type specifiers given in the Code section (see Figure 8) are merely
transferred to the output, while operands are inserted as specified.

Figure 8. Output of a Code Sequence (code output for production 003)
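Code output can be sketched as filling in operands while copying operation codes and type specifiers unchanged. The interpretation of an operand written R(k) as the datum at construct string position k relative to the scan position, and of Fi as the value of a function register, is an assumption for this example, as are the names `resolve` and `emit` and the sample values.

```python
import re

def resolve(operand, F, string, scan):
    """Replace an operand reference by its current value (assumed rules)."""
    relative = re.fullmatch(r"R\((-?\d+)\)", operand)
    if relative:
        return string[scan + int(relative.group(1))][1]
    if operand in F:
        return F[operand]
    return operand  # anything else is passed through literally

def emit(code_block, F, string, scan):
    """Produce output operations for one analysis step."""
    lines = []
    for opcode, pairs in code_block:
        args = ",".join(
            f"{spec}/{resolve(operand, F, string, scan)}" for spec, operand in pairs
        )
        lines.append(f"{opcode}({args})")
    return lines

# Code attached to production 003: BEG(V/R(0)) followed by PWR(C/F6).
code_003 = [("BEG", [("V", "R(0)")]), ("PWR", [("C", "F6")])]
print(emit(code_003, F={"F6": 2}, string=[("Q", 7)], scan=0))
```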

The Diagnostic Message section contains a set of messages, which
are output by PRN internal operations. The operation PRN 1 S1, which is
executed for production 006, prints message 001 and the contents of S1.

6.  TRANSLATION  AND  MACHINE  SPECIFICATION 

A  translator  for  a  given  target  machine  (ML)  produces,  from  an 
input  program  of  BASE  operations,  an  equivalent  program  in  the  target 
assembly  language,  in  a  format  acceptable  to  the  target  assembler.  The 
production  of  assembly  language  guarantees  compatibility  of  the  object 
program  with  the  machine's  monitor  system,  and  allows  the  assumption  in 
translation  of  system  subroutines  and  macros. 

A BASE program contains generalised item declarators, array
declarators, etc., and generalised computation operators (e.g., ADD, SUB).
Since data definition is explicit, the BASE computation operators do not take
account of the data types involved in the operations. Thus for each
computation operation, there is an equivalent set of standard suboperations: e.g.,
corresponding to ADD are the standard suboperations

"add a floating item to a fixed item"

"add a fixed item to a floating item"

and so on. Determination of the specific suboperation required for a given
BASE operation, taking into account the data types involved, is performed
within the translator.
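The selection of a standard suboperation from the operator and the data types involved can be sketched as follows. The function name, the wording table `VERBS`, and the reading that the operand is combined into the accumulator (so that the operand type is named first) are assumptions; the paper gives only the two quoted examples for ADD.

```python
# Hypothetical wording table: operator -> (verb, preposition)
VERBS = {"ADD": ("add", "to"), "SUB": ("subtract", "from")}

def select_suboperation(operator, accumulator_type, operand_type):
    """Name the standard suboperation for an operator and a pair of data types."""
    verb, preposition = VERBS[operator]
    return f"{verb} a {operand_type} item {preposition} a {accumulator_type} item"

# Types would come from the explicit declarators in the BASE program.
declared = {"X": "floating", "I": "fixed"}
print(select_suboperation("ADD", declared["I"], declared["X"]))
```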
Translation thus occurs in two parts:

(a) analysis of BASE operations by an analysis section, to
derive equivalent sequences of standard suboperations,
followed by

(b) expansion of the standard suboperations by a macro-processor
section, to produce assembly code.

A  machine  specification  defines  expansions  of  the  standard  suboperations. 
In  other  words,  it  defines  for  each  standard  suboperation  an  equivalent 
sequence  of  assembly  language  instructions.  Embedded  in  these  expansions 
are  format  specifiers,  which  cause  the  appropriate  format  to  be  generated. 

A  machine  specification  is  processed  by  the  translator  generation  system  to 
produce  corresponding  data  tables,  which  are  combined  with  the  translator 
model  program  to  form  the  desired  translator.  These  data  tables  direct  the 
expansions  performed  by  the  translator's  macro-processor. 

Parameters required by the expansions are furnished by the translator's
analysis section via a communication table, from which they are retrieved
as necessary by the macro-processor section. Within a machine specification,
parameters are specified via position in this table.
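Table-driven expansion with positional parameters can be sketched as below. The instruction mnemonics and the contents of `MACHINE_TABLE` are invented for illustration and belong to no real machine; the `{P1}` placeholder syntax and the function name `expand` are likewise assumptions. Padding each line to an 80-column card image follows the output format the paper describes for assembly code.

```python
# Hypothetical data tables: suboperation -> instruction templates.
# P1, P2, ... refer to positions in the communication table.
MACHINE_TABLE = {
    "add a floating item to a floating item": ["FADD {P1}"],
    "add a fixed item to a floating item": ["FLOAT {P1}", "FADD TEMP"],
}

def expand(suboperation, communication_table):
    """Expand one standard suboperation into 80-column card images."""
    parameters = {
        f"P{position}": value
        for position, value in enumerate(communication_table, start=1)
    }
    return [
        template.format(**parameters).ljust(80)
        for template in MACHINE_TABLE[suboperation]
    ]

for card in expand("add a fixed item to a floating item", ["X"]):
    print(repr(card.rstrip()))
```

Retargeting in this sketch means replacing `MACHINE_TABLE` only, which mirrors the way all machine dependency is confined to the data tables.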

Our present machine specification notation is processor-oriented, and
not easily readable; however, it is planned to formalize this notation. Some
typical macro definitions are shown in Figure 9, in a contemplated notation,
as an illustration of the features provided in a machine specification.
[Figure 9: annotated macro definitions. Panel titles: "Loading ACC and MQ with double precision or complex number (for CDC 1604)"; "Load accumulator (for IBM 7094 FAP)".]

SUMMARY OF NOTATION

(         BEGIN MACRO
,         OPTIONAL PUNCTUATION
/         TAB TO START OF NEXT FIELD
$         "END OF RECORD" MARK
Pn        PARAMETER n
C(n,K)    CONDITIONAL EXPANSION: SKIP K "$" IF PARAMETER n NOT EMPTY
)         END OF MACRO
nH        LITERAL STRING OF n CHARACTERS
M(n)      CALL ON MACRO n


Figure  9.  Some  Typical  Macro  Definitions 

The translator model program, except possibly for one output
procedure, is machine-independent. The analysis of BASE operations is
dependent only on the operator, accumulator data type, and operand data
type involved, while macro expansion is table-driven. All dependency on
the target machine is isolated within the data tables used to direct expansions.
Assembly code is output in the form of 80-column card images, which are
almost universally acceptable by target assemblers. Unusual cases might
require simple modification of the output procedure.


7.  CONCLUSIONS 


Using the syntactic model of [8], we have developed a system to
formally characterize languages which are rich in grammatical structure,
and to subsequently process strings in such languages. Such processing can
produce linear code (BASE language). The BASE language contains
computation and data declaration operations sufficient to accommodate the functions
of ALGOL, FORTRAN, and JOVIAL. BASE is expandable, so that more
convenient or efficient operations may be introduced when these are desirable.
We have shown the feasibility of formally characterizing machine (assembly)
language, and of machine-independent translation (BASE → ML). In sum,
we have presented a rigorously based, machine-independent compiler
generation system.

A consequence of these results is that language invariance can be
maintained from machine to machine. It is possible to have a standard version
of each procedure-oriented language, rather than machine-dependent variants.

The system is presently running on the CDC 1604 computer.
Specifications of ALGOL, FORTRAN, and JOVIAL have been written, as has a
machine specification for the CDC 1604. The ALGOL and FORTRAN
specifications have undergone tentative checkout and modification, as has
the CDC 1604 specification. Preliminary comparisons of operating
characteristics have been made. For a small number of short programs,
our system produces object programs about the same size as do the
manufacturer-supplied compilers, and requires between twice and three
times the computer time. Since our system is a prototype, these results
indicate that it may be possible to generate compiler/translator systems
which have competitive efficiencies. We contemplate major operational
changes, without the sacrifice of theoretical rigour, which should increase
system speed by a factor of between 3 and 5.

The compiler (POL → BASE) portion of this system has other uses.
The ability to formally characterize grammatically rich languages and to
subsequently process strings in such languages is of importance wherever
string-structure-dependent processing is required.